πŸ•·οΈοΈ Job Radar β€’ SCRAPING

Job Radar. Live notifications. AI-processed.

freelancer.com 2026-04-29 🟡

🔹 Refactor raw job descriptions into a structured Technical Manifest (JSON)
👀 Client: 🇺🇸 COSTA, United States (member since 2017-11-02)
💰 Price: $396 (average bid)
🚩 Problem: Capture and structure web data from multiple sites into a well-organized Excel workbook.
📦 Existing: Not specified

Specifications:

[Target] - Websites to be scraped for various types of data such as titles, text blocks, image URLs, product snippets, and contact information.
[Method] - Use Python with BeautifulSoup or Scrapy for web scraping. Ensure the final output is an Excel file with clean column headers and no duplicates.
[UI/UX] - Not applicable
[Stack] - Python (BeautifulSoup or Scrapy), pandas for data manipulation, openpyxl for Excel file creation
[Security] - Respect robots.txt and rate limits to avoid being blocked by target sites. Ensure data privacy and security during processing.
[Format] - Deliver an executable script with inline comments and a completed Excel workbook.
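The robots.txt and rate-limit points under [Security] can be checked before any request is made, using Python's standard-library urllib.robotparser. A minimal sketch, assuming a hypothetical rules file and example URLs (a real script would fetch the live robots.txt from each target site):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; in practice, fetch it from
# https://<target-site>/robots.txt before scraping.
RULES = """
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

def can_fetch(url, agent="*"):
    """Return True if the site's robots.txt allows fetching this URL."""
    return parser.can_fetch(agent, url)

def polite_delay(agent="*", default=1.0):
    """Honor the Crawl-delay directive, falling back to a default pause."""
    delay = parser.crawl_delay(agent)
    return delay if delay is not None else default

print(can_fetch("https://example.com/products/1"))    # allowed path
print(can_fetch("https://example.com/private/data"))  # disallowed path
```

Sleeping for `polite_delay()` seconds between requests to the same host keeps the scraper within the site's stated rate limit.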

Workflow:

1. Define the list of URLs to be scraped based on provided or inferred requirements.
2. Develop a Python script using BeautifulSoup or Scrapy to scrape each URL, extracting relevant data fields such as titles, text blocks, image URLs, product snippets, and contact information.
3. Clean and structure the extracted data into a pandas DataFrame for easy manipulation and formatting.
4. Write the cleaned data to an Excel file using openpyxl or a similar library, ensuring column headers are clear and consistent, with no duplicate rows.
5. Add inline comments to the script for future reference and ease of rerunning the scrape.
6. Test the script against all provided URLs to ensure complete data capture without errors.
7. Validate that the Excel file opens correctly and all columns are populated as expected.
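Steps 3–4 (structure, clean, deduplicate, export) can be sketched with pandas. The records list below is a hypothetical stand-in for what the scraper would extract in steps 1–2, and the sketch writes CSV so it runs without openpyxl; swapping in `to_excel` produces the .xlsx deliverable:

```python
import pandas as pd

# Hypothetical scraped records standing in for BeautifulSoup/Scrapy output.
records = [
    {"title": " Widget A ", "image_url": "https://example.com/a.png", "contact": "a@example.com"},
    {"title": "Widget B", "image_url": "https://example.com/b.png", "contact": "b@example.com"},
    {"title": "Widget A", "image_url": "https://example.com/a.png", "contact": "a@example.com"},  # duplicate
]

def to_clean_frame(rows):
    """Steps 3-4: load rows into a DataFrame, trim whitespace, drop duplicates."""
    df = pd.DataFrame(rows)
    text_cols = df.select_dtypes(include="object").columns
    df[text_cols] = df[text_cols].apply(lambda col: col.str.strip())
    return df.drop_duplicates().reset_index(drop=True)

df = to_clean_frame(records)
# With openpyxl installed, df.to_excel("scraped_data.xlsx", index=False)
# yields the Excel workbook; CSV keeps this sketch dependency-light.
df.to_csv("scraped_data.csv", index=False)
```

Note that trimming whitespace before deduplication matters: " Widget A " and "Widget A" only collapse into one row once both are stripped.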

⚡ Receive notifications instantly. Join our community.